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1. INTRODUCTION 

Vehicular ad hoc networks (VANETs) serve as a fundamental component of intelligent transportation 
systems (ITS), enabling seamless communication between vehicles and other entities [1], [2]. When compared 
to traditional ITS, the VANETs provides a wider range of services. This is because VANET can communicate 
not just with one another, but also with base stations located along the road. Network-based VANET’s 
communication capabilities between vehicle to vehicle (V2V), roadside infrastructure and other vehicles (V2I) 
are its main advantages. Utilising the communication platforms and protocols offered by the supporting 
infrastructure makes this possible [3]-[5]. VANETs have the potential to enhance safety, comfort, and overall 
road efficiency. VANETs consist of interconnected vehicles that can form networks even without prior 
knowledge of each other [6]. The contemporary systems for traffic control and road safety make extensive use 
of these networks. Through the exchange of basic safety messages (BSMs), for instance, crucial information 
like the present location, speed, and acceleration can be transmitted between automobiles, improving eyesight 
and awareness of the situation for the near surroundings [7], [8]. 

Furthermore, VANETs enable vehicles to communicate safety and traffic-related information with 
each other while on the road. VANETs are distinguished by numerous nodes (vehicles) and constant 
movement. The security of VANETs is essential for establishing a reliable ITS that ensures safety. According 
to World Health Organisation (WHO), road accidents are ranked as the eighth major contributor to global 
fatalities [9], [10]. Next, VANETs are understood to be essential tools for preventing traffic accidents and 


Journal homepage: http://ijeecs.iaescore.com 


Indonesian J Elec Eng & Comp Sci ISSN: 2502-4752 O 1427 


fatalities. Vehicles can transmit and receive data packets from roadside units (RSUs) within these networks in 
addition to exchanging data packets among themselves. Because VANETs generate and communicate such a 
significant amount of data, applying machine learning techniques is advantageous because they may be applied 
to a variety of goals [11]-[13]. Artificial intelligence (AD) refers to the ability of a computer or robot to perform 
tasks that are typically associated with intelligent beings, guided by computer programming [14]. The primary 
aim often involves creating systems that possess intellectual capabilities like those demonstrated by humans. 
These capabilities encompass reasoning, comprehension, generalization, and the ability to derive meaning or 
learn from prior experiences [15]—[17]. Many facets of society could be transforms by AI, from bettering 
healthcare outcomes to enabling more effective transportation systems such as VANET. Besides, different 
methods, such as rule-based systems, machine learning, deep learning, and evolutionary computation, can be 
used to create them [18], [19]. 

Recently, machine learning has been used in wireless networks to offer a data-driven method to 
solving formerly difficult challenges [20]. This has led to the development of some solutions to enhance the 
performance of autonomous vehicles (AV) and VANET. Machine learning can be used for VANET testing 
and training by applying machine learning algorithms. then determining which machine learning algorithm is 
best for VANET [21], [22]. During the training stage of machine learning, a model is developed by learning 
from the provided training data. This involves extracting patterns and relationships from the data to create a 
predictive model. In the subsequent testing stage, the trained model is utilized to make predictions or generate 
outputs based on new, unseen data. This two-step process allows us to harness the power of machine learning 
to address complex problems that were traditionally difficult to solve [23]-[27]. 


2. LITERATURE REVIEW 

Machine learning techniques can be categorized into three main divisions: reinforcement learning, 
unsupervised learning, and supervised learning [28], [29]. Among these, supervised learning is commonly used 
in practical applications and relies on labelled datasets. Traditional methods in this category include K-nearest 
neighbours (KNN), decision trees (DT), support vector machines (SVM), neural networks (NN), and random 
forests (RF) [30]-[33]. Unsupervised learning involves analysing data without prior knowledge or labels, 
focusing on either the specific characteristics of individual samples or the similarities between samples to group 
them together. Classic clustering techniques, such as k-means, hierarchical clustering, spectrum clustering, and 
dirichlet process, are commonly used in this context [34], [35]. Reinforcement learning aims to maximize the 
rewards obtained from interacting with an environment. It relies on the markov decision process (MDP) model, 
which revolves around taking actions and receiving corresponding rewards. 

Rashid et al. [36] presented the issue of detecting malicious nodes in VANET is addressed in this 
study, where a machine learning-based solution is developed for real-time identification of dangerous nodes. 
Furthermore, a distributed multi-layer classifier is proposed, and its performance is evaluated using objective 
modular network testbed in C++ (OMNeT++) and simulation of urban mobility (SUMO), along with machine 
learning models such as gradient boosting trees (GBT), logistic regression (LR), multilayer perceptron 
classifier (MLPC), RF, and SVM. The proposed model is designed to be applied to a dataset containing 
information about attacking and regular vehicles. To compare the performance of different machine learning 
models, GBT, LR, MLPC, RF, and SVM are employed. RF is particularly recommended for enhanced 
detection of misbehaviours. The results accurately forecast how effectively the suggested architecture manages 
the research within the VANET environment. 

Kezia and Anusuya [37] outlined to predict the local density of vehicles and the channel busy ratio, 
with the exclusion of beacon considerations. To achieve this, various machine learning techniques such as LR, 
KNN, Naive Bayes (NB), DT, and RF algorithms are utilized. The most accurate predictions are then utilized 
in a congestion control system based on prediction and adaptation. The results reveal that the RF classification 
technique achieves a remarkable 99.3% accuracy in predicting the number of cars. Additionally, the DT method 
effectively determines the channel busy ratio with a root mean square error (RMSE) of 0.018 for the predicted 
number of cars. By incorporating these predictions into the system, the adaptation of transmission parameters 
is improved, leveraging the calculations of local density and channel busy ratio (CBR). Consequently, this 
enhancement significantly enhances the overall network performance. 

Khan et al. [38] highlighted that enabling vehicles to make autonomous decisions is a fundamental 
requirement in VANET systems. Vehicles select the best path using cognitive memory, which saves all the 
past routes. Communication is key in networks. With cellular-based vehicle-to-everything (CV2X) 
communication, each vehicle broadcasts a cooperative awareness message (CAM) that is used to exchange 
critical information. Finding that results from the proposed SVM-CV2X-M4 system are precise. The vehicles 
can therefore choose their best path on their own. The proposed system is effective and simple to use thanks to 
a variety of strategies and algorithms. The planned SVM-CV2X-M4 system's goals are to assist users and make 
commuting easier. 
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Ajay and Kumaraswamy [39] demonstrated that advancements in vehicle safety measures have 
contributed to a reduction in traffic accidents. During congested periods, predicting traffic patterns becomes 
crucial in selecting optimal routes, resulting in time and fuel savings. This article focuses on traffic forecasting 
and utilizes various machine learning techniques to gather and analyse data. By employing a specialized 
machine learning model, the collected dataset significantly enhances the accuracy of traffic prediction 
compared to previous methods. The RF algorithm achieves an impressive 97.82 % of accuracy rate in 
predicting traffic data [34]. 

Alhaidari and Alrehan [40] introduced a state-of-the-art simulation technique to generate a reliable 
distributed denial of service dataset specifically tailored for automotive ad hoc networks. The architecture, 
traffic volume, attack power, and node mobility of the VANET are all considered when analysing the 
distributed denial of service attack traffic in the VANET dataset. Popular simulation programme including 
SUMO, OMNeT++, Veins, and internet networking (INET). Finding that apart from SVM, which displays an 
accuracy of 97%, the classifiers reported by VANET distributed denial of service dataset (VDDD) obtained 
excellent detection accuracies that were typically better than 99%. When compared to other applicable machine 
learning classifiers, the RF classifier has the greatest accuracy (99.7%). 

Alsarhan et al. [41] incorporated SVM algorithm to detect intrusions in VANET. Numerous 
computational benefits of the SVM structure include direction at a finite sample and independence of algorithm 
complexity to sample dimension. Finding that based on the network security laboratory—knowledge discovery 
in databases (NSL-KDD) data, this article simulates the interactions between the driver and the VANET, as 
well as the communications between the vehicles and RSUs of the VANET. The KDD99 version of NSL-KDD 
data has been cleaned up and refined. It fixes a few of the KDD99’s fundamental issues. The entire KDD 
dataset is contained in NSL-KDD. Internet traffic that was handled by an actual intrusion detection system is 
contained in the records. 

Gad et al. [42] highlighted the extensive utilization of the NSL-KDD and KDD-CUP99 datasets in 
various studies. Nevertheless, these datasets suffer from a limitation of not encompassing up-to-date attack 
information. In our research, we utilized a realistic dataset called telemetry data, operating data, network data 
from internet of things (ToN-IoT), which was obtained from a large-scale, diverse internet of things (IoT) 
network. We experimented with various machine learning techniques to address both binary and multi-class 
classification challenges. The findings demonstrate that employing the Chi2 feature selection method and 
synthetic minority over-sampling technique (SMOTE) leads to meaningful results when using KNN algorithm. 
Furthermore, the outcomes for all applied machine learning techniques using the Chi2 feature selection and 
SMOTE techniques were evaluated. 

Kadam and Sekhar [43] proposed a new secure framework for identifying distributed denial of service 
(DDoS) using a novel hybrid KSVM scheme, which combines KNN and SVM algorithms. attacks is 
established The research considers multiple machine learning approaches and evaluates the proposed hybrid 
kernel support vector machine (KS VM) algorithm in terms of accuracy, sensitivity, precision, recall, and error 
for detecting malicious DDoS activity. The results demonstrate that the hybrid KSVM algorithm outperforms 
other machine learning algorithms, making it a superior choice for DDoS attack detection. 

Sonker and Gupta [44] focused on detecting five main types of attacks in VANETs: constant attack, 
constant offset attack, random attack, random offset attack, and eventual attack. Various machine learning 
techniques are employed to identify these attacks. Initially, binary classification methods are utilized to detect 
each individual attack. Subsequently, a novel approach is developed to perform multi-classification of the 
attacks, using the best-performing machine learning algorithm identified from the previous step. The results 
show that NB, DT, and RF achieve 100% accuracy in binary classification for detecting all attacks. In terms of 
multi-classification, the RF technique achieves the highest accuracy of 97.62%. The vehicular reference 
misbehaviour (VeReMi) dataset, which is a publicly available repository for detecting malicious nodes in 
VANETs, is utilized in this research. 

Wang et al. [45] addressed the escalating issue of severe traffic congestion resulting from the growing 
number of vehicles, leading to increased travel time and financial burden on transportation users. To some 
extent, forecasting future traffic patterns can help to solve this issue. In this study, it employs three machine 
learning algorithms, namely LR, DT, and SVM, to simulate traffic flow. Prior to conducting the simulation, it 
pre-processes the data using Selenium, open-source software (OSS), and a message queue. next, it examines 
Beijing’s actual traffic information as displayed on the Baidu map. This study demonstrates that RF has the 
highest accuracy of the three approaches, at 0.719. The second is SVM and LR. 

Anand and Sankhe [46] discussed the various applications of traffic flow forecasting in effectively 
managing regional and urban traffic. Traffic control in large cities is quite challenging. However, it was 
discovered that considering certain physical factors like the surroundings and weather made the forecasts more 
accurate. In this work, a traffic flow forecasting system model is created to anticipate traffic data at intervals 
of between one hour and twenty-four hours. In the past, research using predictive algorithms was also carried 
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out, but it was discovered that not many systems that make it simple to forecast traffic flow and allow access 
to general consumers. The system is made to address issues with time series and history. Datasets of previous 
traffic were gathered from public sources and cleansed upon request. A system that collects data from the road 
using vehicle detecting sensors and storing it in a database for future predictions was developed using machine 
learning algorithms. To retrieve weather information, it additionally incorporated the weather service’s 
application programming interface (API). This traffic flow forecasting model was developed using artificial 
neural networks (ANN) and SVM, two of the most popular machine learning forecasting approaches currently 
in use. 

Le and Maple [47] describe three novel elements, discovered through research on n consecutive 
locations, that play a crucial role in determining a vehicle’s behavior. Two machine learning algorithms which 
is KNN and SVM. Those machine learning algorithm use these attributes to detect assaults in the VeReMi 
dataset. It demonstrates that the overall precision rates may reach. Besides, this paper has four different 
algorithms were examined, and it was discovered that KNN and RF classifiers produce the best outcomes. 
Then, using the same VeReMi dataset, we compared the outcomes of our strategy (which makes use of KNN) 
with those of other current strategies that are listed in the literature. According to the findings, the suggested 
strategy significantly surpasses the currently used methods for overall attack detection in terms of both 
precision and recall. 

Mythili and Magendran [48] discussed the limitations associated with unscrupulous individuals who 
attempt to manipulate information for their own purposes, emphasizing that such information exchange is not 
always reliable or accurate. However, how to identify numerous attacks in the VANET and how to increase 
security while maintaining the level of service quality in the VANET becomes a very challenging problem. 
Finding that the outcome of the experiment demonstrated how effective the suggested algorithm was in 
detecting malicious nodes. The accuracy of the SVM algorithm is 97.46%, and the accuracy of the enhanced 
SVM algorithm is 98.01%. However, for multiple attack detection in VANET, powerful optimisation-based 
clustering method and ensemble SVM classifier beat previous techniques by offering 98.99% accuracy. 

Shi and Wu [49] proposed a machine learning algorithm-based model for detecting road functionality 
based on data aggregation in VANETs. Numerous simulations are used to assess the scheme’s performance. 
The simulation findings demonstrate that the method of combining the data from moving cars may precisely 
identify the state of a dysfunctional road and pinpoint a specific malfunctioning place, especially for large 
dimensions. Finding that to identify the aberrant traffic state, a road functionality detection model based on 
VANET, and machine learning has been built. In the simulation, SVM and deep neural network (DNN) are 
compared to determine the dysfunctional road state, and DNN is then used to estimate where the defect will be 
exactly. The evaluation's results indicate that it is possible to accurately determine the present traffic situation 
by merging various data about the moving items on the road. 

So et al. [50] introduced a system framework that employs plausibility checks as a feature vector for 
machine learning models in order to detect and classify misbehavior. The results indicate that by incorporating 
KNN and SVM, the overall precision of detecting misbehaviour using plausibility checks in the feature vectors 
can be increased by more than 20%, while maintaining a recall rate of less than 5%. The research findings 
demonstrate that our proof-of-concept misbehaviour detection system, which combines machine learning and 
plausibility checks, achieves a 20% improvement in precision while keeping the recall within a 5% range of 
the recall achieved by plausibility checks alone. This framework provides a system that not only identifies 
inappropriate behaviours but also categorizes them, enabling more targeted corrective actions. Table | (see 
Appendix) shows the existing works of machine learning approach on VANET deployment. 


3. RESEARCH GAPS AND DISCUSSION 
3.1. Datasets 

One of the crucial research gap exists in the realm of VANET research due to the scarcity of 
standardized and comprehensive benchmark datasets. While studies utilize various datasets such as VeReMi, 
NSL-KDD, and ToN-IoT, there is a need for standardized datasets that encompass a wide range of VANET 
scenarios and capture real-world traffic patterns, mobility, and attacks. The availability of high-quality datasets 
would enable researchers to compare and validate their proposed solutions effectively. Collaboration among 
researchers, industry partners, and governmental agencies is crucial for establishing standardized benchmark 
datasets for VANET research. The datasets should capture various real-world scenarios, including different 
traffic patterns, mobility patterns, and attack scenarios. Creating a repository for sharing these benchmark 
datasets would facilitate progress in the field and ensure fair and accurate comparisons among different 
research endeavors. 
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3.2. Performance 

A notable research gap in VANET applications lies in the absence of comprehensive comparative 
evaluations of machine learning algorithms. Although studies employ different algorithms such as GBT, LR, 
MLPC, RF, and SVM, there is a need for rigorous benchmarking to determine their accuracy, efficiency, and 
scalability. Comparative evaluations would provide valuable insights into the strengths and weaknesses of 
different approaches and help identify the most suitable algorithms for specific VANET scenarios. To address 
this research gap, it is important to emphasize the need for standardized datasets, evaluation criteria, and factors 
such as real-time performance in order to ensure fair and meaningful comparisons. By conducting 
comprehensive benchmarking studies, researchers can contribute to advancing the field of VANETs and 
facilitate the development of more effective and efficient machine learning solutions. 


3.2. Security 

Another research gap pertains to the limited consideration of diverse attack scenarios in VANETs. 
While some studies focus on detecting specific types of attacks, there is a need for a comprehensive exploration 
of various attack scenarios to develop robust security mechanisms. Understanding and addressing different 
attack types, including DDoS, misbehaving nodes, and intrusion detection, are crucial for ensuring the security 
and integrity of VANETs. Developing a taxonomy specific to VANETs and systematically investigating each 
attack category can provide valuable insights for the development of effective security measures. Furthermore, 
collaboration with industry partners and governmental agencies is essential to gather real-world attack data, 
enabling more accurate analysis and testing of security mechanisms. By addressing this research gap, VANETs 
can be better safeguarded against a wide range of attacks, ensuring their integrity and reliability. 


3.4. Scalability and resource constraints 

The scalability and resource constraints of VANETs present a significant research gap. As the number 
of vehicles in VANETs increases, issues such as network congestion, energy consumption, and limited 
bandwidth become critical factors that need to be addressed. Existing research often lacks comprehensive 
solutions that can handle the scaling challenges of vehicular networks while optimizing resource utilization 
and maintaining system performance and security. Thus, approaches such as distributed computing, edge 
computing, or cloud-assisted solutions should be incorporated to handle the scalability challenges of vehicular 
networks. These approaches can leverage distributed resources, optimize energy consumption, and enhance 
overall system performance. Additionally, energy-efficient algorithms could be employed to optimize resource 
utilization without compromising system performance or security. 


4. CONCLUSION 

In conclusion, machine learning techniques have emerged as valuable tools in addressing the 
challenges of VANETs. These techniques have been applied to improve security, predict traffic patterns, 
enhance network performance, and detect malicious nodes in VANETs. The literature review highlights the 
effectiveness of algorithms such as SVM, RF, DT, and ensemble classifiers in achieving accurate results for 
different VANET applications. As VANETs continue to play a crucial role in ITS, integrating machine learning 
techniques will further enhance vehicular networks safety, efficiency, and overall performance. Future research 
can explore machine learning algorithms, hybrid models, and optimization techniques to tackle complex 
challenges in VANETs and improve their functionality. By harnessing the capability of machine learning, 
VANETs have the potential to revolutionize the transportation industry and pave the way for safer, smarter, 
and more efficient road networks. 


APPENDIX 
Table 1. Existing work of machine learning approach on VANET deployment 
Study Features Algorithm Training set Accuracy 
Rashid et al. [36] DDoS SVM OMNeT++, SUMO++ and Veins 97.00% 
RF 98.00% 
Kezia and Anusuya [37] Traffic KNN CBR 75.18% 
RF 99.30% 
DT 86.22% 
Khan et al. [38] Traffic SVM SVM-CV2X-M4 system 99.60% 
Ajay and Kumaraswamy [39] Traffic RF Traffic condition 82.33% 
SVM 97.82% 
90.67% 
Alhaidari and Alrehan [40] DDoS SVM VDDD 97.36% 
RF 99.75% 
KNN 99.73% 
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Table 1. Existing work of machine learning approach on VANET deployment (continue) 


Study Features Algorithm Training set Accuracy 
Alsarhan et al. [41] Misbehavior GA-SVM NSL-KDD 98.00% 
detection PSO-SVM 94.00% 
ACO-SVM 89.00% 
Gad et al. [42] Intrusion KNN ToN-IoT (binary classification) 86.80% 
detection system SVM 98.80% 
RF 97.90% 
Kadam and Sekhar [43] DDoS KNN Hybrid KSVM 86.00% 
SVM 92.00% 
Sonker Gupta [44] Misbehavior KNN VeReMi (type | attack) 99.10% 
detection RF 100% 
Wang et al. [45] Traffic SVM Traffic condition 71.90% 
RF 65.70% 
Anand and Sankhe [46] Traffic ANN Performance measure system 99.00% 
SVM (PEMS) 99.00% 
Le and Maple [47] Misbehavior KNN VeReMi - 
detection SVM 
Mythili and Magendran [48] Attack detection SVM - 97.46% 
Shi and Wu [49] Traffic SVM Planung transport Verkehr- 84.00% 
DNN verkehr in stadten simulations 87.10% 
model (PTV-VISSIM) 
So et al. [50] Misbehavior KNN - - 
detection SVM 
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